大型的语言模型(PRELMS)正在彻底改变所有基准的自然语言处理。但是,它们的巨大尺寸对于小型实验室或移动设备上的部署而言是过分的。修剪和蒸馏等方法可减少模型尺寸,但通常保留相同的模型体系结构。相反,我们探索了蒸馏预告片中的更有效的架构,单词的持续乘法(CMOW),该构造将每个单词嵌入为矩阵,并使用矩阵乘法来编码序列。我们扩展了CMOW体系结构及其CMOW/CBOW-HYBRID变体,具有双向组件,以提供更具表现力的功能,在预绘制期间进行一般(任务无义的)蒸馏的单次表示,并提供了两种序列编码方案,可促进下游任务。句子对,例如句子相似性和自然语言推断。我们的基于矩阵的双向CMOW/CBOW-HYBRID模型在问题相似性和识别文本范围内的Distilbert具有竞争力,但仅使用参数数量的一半,并且在推理速度方面快三倍。除了情感分析任务SST-2和语言可接受性任务COLA外,我们匹配或超过ELMO的ELMO分数。但是,与以前的跨架结构蒸馏方法相比,我们证明了检测语言可接受性的分数增加了一倍。这表明基于基质的嵌入可用于将大型预赛提炼成竞争模型,并激励朝这个方向进行进一步的研究。
translated by 谷歌翻译
This article charts the work of a 4 month project aimed at automatically identifying patterns of tweets popularity evolution using Machine Learning and Deep Learning techniques. To apprehend both the data and the extent of the problem, a straightforward clustering algorithm based on a point to point distance is used. Then, in an attempt to refine the algorithm, various analyses especially using feature extraction techniques are conducted. Although the algorithm eventually fails to automate such a task, this exercise raises a complex but necessary issue touching on the impact of virality on social networks.
translated by 谷歌翻译
The detection of anomalies in time series data is crucial in a wide range of applications, such as system monitoring, health care or cyber security. While the vast number of available methods makes selecting the right method for a certain application hard enough, different methods have different strengths, e.g. regarding the type of anomalies they are able to find. In this work, we compare six unsupervised anomaly detection methods with different complexities to answer the questions: Are the more complex methods usually performing better? And are there specific anomaly types that those method are tailored to? The comparison is done on the UCR anomaly archive, a recent benchmark dataset for anomaly detection. We compare the six methods by analyzing the experimental results on a dataset- and anomaly type level after tuning the necessary hyperparameter for each method. Additionally we examine the ability of individual methods to incorporate prior knowledge about the anomalies and analyse the differences of point-wise and sequence wise features. We show with broad experiments, that the classical machine learning methods show a superior performance compared to the deep learning methods across a wide range of anomaly types.
translated by 谷歌翻译
Riemannian geometry provides powerful tools to explore the latent space of generative models while preserving the inherent structure of the data manifold. Lengths, energies and volume measures can be derived from a pullback metric, defined through the immersion that maps the latent space to the data space. With this in mind, most generative models are stochastic, and so is the pullback metric. Manipulating stochastic objects is strenuous in practice. In order to perform operations such as interpolations, or measuring the distance between data points, we need a deterministic approximation of the pullback metric. In this work, we are defining a new metric as the expected length derived from the stochastic pullback metric. We show this metric is Finslerian, and we compare it with the expected pullback metric. In high dimensions, we show that the metrics converge to each other at a rate of $\mathcal{O}\left(\frac{1}{D}\right)$.
translated by 谷歌翻译
Insects as pollinators play a key role in ecosystem management and world food production. However, insect populations are declining, calling for a necessary global demand of insect monitoring. Existing methods analyze video or time-lapse images of insects in nature, but the analysis is challenging since insects are small objects in complex and dynamic scenes of natural vegetation. The current paper provides a dataset of primary honeybees visiting three different plant species during two months of summer-period. The dataset consists of more than 700,000 time-lapse images from multiple cameras, including more than 100,000 annotated images. The paper presents a new method pipeline for detecting insects in time-lapse RGB-images. The pipeline consists of a two-step process. Firstly, the time-lapse RGB-images are preprocessed to enhance insects in the images. We propose a new prepossessing enhancement method: Motion-Informed-enhancement. The technique uses motion and colors to enhance insects in images. The enhanced images are subsequently fed into a Convolutional Neural network (CNN) object detector. Motion-Informed-enhancement improves the deep learning object detectors You Only Look Once (YOLO) and Faster Region-based Convolutional Neural Networks (Faster R-CNN). Using Motion-Informed-enhancement the YOLO-detector improves average micro F1-score from 0.49 to 0.71, and the Faster R-CNN-detector improves average micro F1-score from 0.32 to 0.56 on the our dataset. Our datasets are published on: https://vision.eng.au.dk/mie/
translated by 谷歌翻译
In today's uncertain and competitive market, where enterprises are subjected to increasingly shortened product life-cycles and frequent volume changes, reconfigurable manufacturing systems (RMS) applications play a significant role in the manufacturing industry's success. Despite the advantages offered by RMS, achieving a high-efficiency degree constitutes a challenging task for stakeholders and decision-makers when they face the trade-off decisions inherent in these complex systems. This study addresses work tasks and resource allocations to workstations together with buffer capacity allocation in RMS. The aim is to simultaneously maximize throughput and minimize total buffer capacity under fluctuating production volumes and capacity changes while considering the stochastic behavior of the system. An enhanced simulation-based multi-objective optimization (SMO) approach with customized simulation and optimization components is proposed to address the abovementioned challenges. Apart from presenting the optimal solutions subject to volume and capacity changes, the proposed approach support decision-makers with discovered knowledge to further understand the RMS design. In particular, this study presents a problem-specific customized SMO combined with a novel flexible pattern mining method for optimizing RMS and conducting post-optimal analyzes. To this extent, this study demonstrates the benefits of applying SMO and knowledge discovery methods for fast decision-support and production planning of RMS.
translated by 谷歌翻译
Adaptation-relevant predictions of climate change are often derived by combining climate models in a multi-model ensemble. Model evaluation methods used in performance-based ensemble weighting schemes have limitations in the context of high-impact extreme events. We introduce a locally time-invariant model evaluation method with focus on assessing the simulation of extremes. We explore the behaviour of the proposed method in predicting extreme heat days in Nairobi.
translated by 谷歌翻译
This paper presents an accurate, highly efficient, and learning-free method for large-scale odometry estimation using spinning radar, empirically found to generalize well across very diverse environments -- outdoors, from urban to woodland, and indoors in warehouses and mines - without changing parameters. Our method integrates motion compensation within a sweep with one-to-many scan registration that minimizes distances between nearby oriented surface points and mitigates outliers with a robust loss function. Extending our previous approach CFEAR, we present an in-depth investigation on a wider range of data sets, quantifying the importance of filtering, resolution, registration cost and loss functions, keyframe history, and motion compensation. We present a new solving strategy and configuration that overcomes previous issues with sparsity and bias, and improves our state-of-the-art by 38%, thus, surprisingly, outperforming radar SLAM and approaching lidar SLAM. The most accurate configuration achieves 1.09% error at 5Hz on the Oxford benchmark, and the fastest achieves 1.79% error at 160Hz.
translated by 谷歌翻译
Biological cortical networks are potentially fully recurrent networks without any distinct output layer, where recognition may instead rely on the distribution of activity across its neurons. Because such biological networks can have rich dynamics, they are well-designed to cope with dynamical interactions of the types that occur in nature, while traditional machine learning networks may struggle to make sense of such data. Here we connected a simple model neuronal network (based on the 'linear summation neuron model' featuring biologically realistic dynamics (LSM), consisting of 10 of excitatory and 10 inhibitory neurons, randomly connected) to a robot finger with multiple types of force sensors when interacting with materials of different levels of compliance. Scope: to explore the performance of the network on classification accuracy. Therefore, we compared the performance of the network output with principal component analysis of statistical features of the sensory data as well as its mechanical properties. Remarkably, even though the LSM was a very small and untrained network, and merely designed to provide rich internal network dynamics while the neuron model itself was highly simplified, we found that the LSM outperformed these other statistical approaches in terms of accuracy.
translated by 谷歌翻译
Recent 3D-based manipulation methods either directly predict the grasp pose using 3D neural networks, or solve the grasp pose using similar objects retrieved from shape databases. However, the former faces generalizability challenges when testing with new robot arms or unseen objects; and the latter assumes that similar objects exist in the databases. We hypothesize that recent 3D modeling methods provides a path towards building digital replica of the evaluation scene that affords physical simulation and supports robust manipulation algorithm learning. We propose to reconstruct high-quality meshes from real-world point clouds using state-of-the-art neural surface reconstruction method (the Real2Sim step). Because most simulators take meshes for fast simulation, the reconstructed meshes enable grasp pose labels generation without human efforts. The generated labels can train grasp network that performs robustly in the real evaluation scene (the Sim2Real step). In synthetic and real experiments, we show that the Real2Sim2Real pipeline performs better than baseline grasp networks trained with a large dataset and a grasp sampling method with retrieval-based reconstruction. The benefit of the Real2Sim2Real pipeline comes from 1) decoupling scene modeling and grasp sampling into sub-problems, and 2) both sub-problems can be solved with sufficiently high quality using recent 3D learning algorithms and mesh-based physical simulation techniques.
translated by 谷歌翻译